44 research outputs found
Task-specific Word Identification from Short Texts Using a Convolutional Neural Network
Task-specific word identification aims to choose the task-related words that
best describe a short text. Existing approaches require well-defined seed words
or lexical dictionaries (e.g., WordNet), which are often unavailable for many
applications such as social discrimination detection and fake review detection.
However, we often have a set of labeled short texts where each short text has a
task-related class label, e.g., discriminatory or non-discriminatory, specified
by users or learned by classification algorithms. In this paper, we focus on
identifying task-specific words and phrases from short texts by exploiting
their class labels rather than using seed words or lexical dictionaries. We
treat task-specific word and phrase identification as a feature learning problem.
We train a convolutional neural network over a set of labeled texts and use
score vectors to localize the task-specific words and phrases. Experimental
results on sentiment word identification show that our approach significantly
outperforms existing methods. We further conduct two case studies to show the
effectiveness of our approach. One case study on a crawled tweets dataset
demonstrates that our approach can successfully capture the
discrimination-related words/phrases. The other case study on fake review
detection shows that our approach can identify the fake-review words/phrases.
Comment: accepted by Intelligent Data Analysis, an International Journal
SAFE: A Neural Survival Analysis Model for Fraud Early Detection
Many online platforms have deployed anti-fraud systems to detect and prevent
fraudulent activities. However, there is usually a gap between the time that a
user commits a fraudulent action and the time that the user is suspended by the
platform. How to detect fraudsters in time is a challenging problem. Most of
the existing approaches adopt classifiers to predict fraudsters given their
activity sequences over time. The main drawback of classification models is
that the prediction results between consecutive timestamps are often
inconsistent. In this paper, we propose a survival analysis based fraud early
detection model, SAFE, which maps dynamic user activities to survival
probabilities that are guaranteed to be monotonically decreasing over time.
SAFE adopts a recurrent neural network (RNN) to handle user activity sequences
and directly outputs hazard values at each timestamp; the survival
probability derived from these hazard values is then used to achieve consistent
predictions. Because we only observe the user suspended time instead of the
fraudulent activity time in the training data, we revise the loss function of
the regular survival model to achieve fraud early detection. Experimental
results on two real-world datasets demonstrate that SAFE outperforms both the
survival analysis model and recurrent neural network model alone as well as
state-of-the-art fraud early detection approaches.
Comment: To appear in AAAI-201
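The link between per-timestamp hazard values and a monotone survival curve can be illustrated with a minimal sketch. It assumes the common convention S_t = exp(-(h_1 + ... + h_t)) with non-negative hazards; the abstract does not state the exact formula, and the function names and the alert threshold below are hypothetical, not taken from the paper.

```python
import math

def survival_from_hazards(hazards):
    """Map non-negative hazard values to survival probabilities.

    Assumed convention (not stated in the abstract):
    S_t = exp(-(h_1 + ... + h_t)).
    Because every hazard is >= 0, the cumulative sum never shrinks,
    so the survival curve can never increase over time.
    """
    survival = []
    cumulative = 0.0
    for h in hazards:
        if h < 0:
            raise ValueError("hazard values must be non-negative")
        cumulative += h
        survival.append(math.exp(-cumulative))
    return survival

def first_alert(survival, threshold):
    """Return the first timestamp index whose survival probability falls
    below `threshold` (a hypothetical decision rule), or None if none does."""
    for t, s in enumerate(survival):
        if s < threshold:
            return t
    return None

# Hazards as an RNN might emit them for one user (made-up numbers).
example_hazards = [0.1, 0.0, 0.5, 0.2]
example_survival = survival_from_hazards(example_hazards)
```

Since the survival curve only moves downward, a user flagged at one timestamp stays flagged at every later one, which is the consistency property the abstract contrasts with per-timestamp classifiers.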
Spectrum-based deep neural networks for fraud detection
In this paper, we focus on fraud detection on a signed graph with only a
small set of labeled training data. We propose a novel framework that combines
deep neural networks and spectral graph analysis. In particular, we use the
node projection (called the spectral coordinate) in the low-dimensional spectral
space of the graph's adjacency matrix as input of deep neural networks.
Spectral coordinates in the spectral space capture the most useful topology
information of the network. Due to the small dimension of spectral coordinates
(compared with the dimension of the adjacency matrix derived from a graph),
training deep neural networks becomes feasible. We develop and evaluate two
neural networks, deep autoencoder and convolutional neural network, in our
fraud detection framework. Experimental results on a real signed graph show
that our spectrum-based deep neural networks are effective in fraud detection.
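As a rough illustration of spectral coordinates, the sketch below projects each node of a small signed graph onto the eigenvectors of its adjacency matrix. It makes two assumptions that the abstract does not confirm: that the eigenvectors with the largest eigenvalues are the ones used, and that a shifted power iteration with deflation is an acceptable stand-in for a proper eigensolver. The toy graph and all names are hypothetical.

```python
import math
import random

def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def spectral_coordinates(A, k, iters=2000, seed=0):
    """Return (eigenvalues, coords) for a symmetric adjacency matrix A.

    coords[i] is the k-dimensional spectral coordinate of node i, i.e.
    the i-th entries of the k eigenvectors with the largest eigenvalues.
    Power iteration runs on A + shift*I, which is positive definite, so
    it converges to the largest algebraic eigenvalue of A.
    """
    n = len(A)
    rng = random.Random(seed)
    # Gershgorin bound plus 1 makes every shifted eigenvalue positive.
    shift = 1.0 + max(sum(abs(a) for a in row) for row in A)
    values, vectors = [], []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [wi + shift * vi for wi, vi in zip(matvec(A, v), v)]
            # Deflation: remove components along eigenvectors already found.
            for u in vectors:
                d = sum(wi * ui for wi, ui in zip(w, u))
                w = [wi - d * ui for wi, ui in zip(w, u)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            v = [wi / norm for wi in w]
        # Rayleigh quotient gives the eigenvalue of the unshifted matrix.
        values.append(sum(a * b for a, b in zip(matvec(A, v), v)))
        vectors.append(v)
    coords = [[vectors[j][i] for j in range(k)] for i in range(n)]
    return values, coords

# Toy signed graph: two positive triangles {0,1,2} and {3,4,5},
# joined by a single negative edge between nodes 2 and 3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, -1, 0, 0],
    [0, 0, -1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
eigenvalues, coords = spectral_coordinates(A, k=2)
```

Each node is now described by 2 numbers instead of a 6-entry adjacency row. In this toy graph the first coordinate has opposite signs on the two triangles, so the low-dimensional input still reflects the negative edge between them.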
LogGPT: Log Anomaly Detection via GPT
Detecting system anomalies based on log data is important for ensuring the
security and reliability of computer systems. Recently, deep learning models
have been widely used for log anomaly detection. The core idea is to model the
log sequences as natural language and adopt deep sequential models, such as
LSTM or Transformer, to encode the normal patterns in log sequences via
language modeling. However, there is a gap between language modeling and
anomaly detection as the objective of training a sequential model via a
language modeling loss is not directly related to anomaly detection. To bridge
the gap, we propose LogGPT, a novel framework that employs GPT for log anomaly
detection. LogGPT is first trained to predict the next log entry based on the
preceding sequence. To further enhance the performance of LogGPT, we propose a
novel reinforcement learning strategy to finetune the model specifically for
the log anomaly detection task. The experimental results on three datasets show
that LogGPT significantly outperforms existing state-of-the-art approaches.
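The idea of detecting anomalies through next-entry prediction can be sketched with a deliberately simple stand-in. The block below replaces GPT with a bigram count model and flags a sequence when an observed log key is not among the top-k predicted next keys. The top-k rule is an assumption borrowed from common practice in prediction-based log anomaly detection; the abstract does not specify the decision rule, and the reinforcement learning fine-tuning step is not shown. All names and log keys are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram(normal_sequences):
    """Count which log key follows which in normal sequences.

    This bigram table is a toy substitute for a trained language model:
    counts[prev] ranks candidate next keys by how often they followed prev.
    """
    counts = defaultdict(Counter)
    for seq in normal_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def is_anomalous(counts, seq, k=2):
    """Flag `seq` if any observed key is outside the top-k predictions
    for its predecessor (assumed decision rule, see lead-in)."""
    for prev, nxt in zip(seq, seq[1:]):
        predicted = counts.get(prev, Counter())
        top_k = [key for key, _ in predicted.most_common(k)]
        if nxt not in top_k:
            return True
    return False

# Made-up normal log key sequences.
normal = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]
model = train_bigram(normal)
```

With this toy model, `is_anomalous(model, ["open", "read", "close"])` is False, while `is_anomalous(model, ["open", "close"])` is True because "close" never directly followed "open" in the normal data.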
Ensino da língua portuguesa na China: uma análise de alguns planos curriculares
In recent years, China has invested heavily in the teaching of Portuguese as a
foreign language (PLE) in order to strengthen trade relations with
Portuguese-speaking countries. To meet the needs of this exchange,
undergraduate degree programs in Portuguese at Chinese higher-education
institutions have grown and expanded considerably.
This work presents a study of the curricula of undergraduate Portuguese
programs in Mainland China and analyzes some of them, in particular that of
Xi’an International Studies University in comparison with that of the
University of Macau. The aim is to follow the current development of Portuguese
teaching in the Chinese context and to identify the existing problems.
Robust Fraud Detection via Supervised Contrastive Learning
Deep learning models have recently become popular for detecting malicious
user activity sessions in computing platforms. In many real-world scenarios,
only a few labeled malicious sessions and a large number of normal sessions are
available. These few labeled malicious sessions usually do not cover the entire
diversity of all possible malicious sessions. In many scenarios, possible
malicious sessions can be highly diverse. As a consequence, learned session
representations of deep learning models can become ineffective in achieving a
good generalization performance for unseen malicious sessions. To tackle this
open-set fraud detection challenge, we propose a robust supervised contrastive
learning-based framework called ConRo, which specifically operates in the
scenario where only a few malicious sessions with limited diversity are
available. ConRo applies an effective data augmentation strategy to generate
diverse potential malicious sessions. By employing these generated and
available training set sessions, ConRo derives separable representations with
respect to the open-set fraud detection task by leveraging supervised
contrastive learning. We
empirically evaluate our ConRo framework and other state-of-the-art baselines
on benchmark datasets. Our ConRo framework demonstrates noticeable performance
improvement over state-of-the-art baselines.
Comment: 16 pages, 5 figures, and 3 tables
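To make "separable representations via supervised contrastive learning" more concrete, here is a minimal sketch of the standard supervised contrastive loss over a batch of session embeddings. It assumes the widely used formulation in which each anchor is pulled toward same-label samples and pushed away from all others. ConRo's actual objective and its data augmentation strategy are not described in the abstract and are not reproduced here; the embeddings and the temperature value are made up.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors with a positive.

    For anchor i, positives are the other samples sharing its label, and
    the softmax denominator runs over every sample except i itself.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Made-up 2-D session embeddings: two tight groups.
sessions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_separated = supcon_loss(sessions, [0, 0, 1, 1])  # labels match the groups
loss_mixed = supcon_loss(sessions, [0, 1, 0, 1])      # labels cut across groups
```

The loss is small when same-label sessions sit close together and large when labels cut across the groups, so minimizing it pushes the encoder toward representations in which malicious and normal sessions are separable.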